Readme:

Good day. For this exercise, I am playing the role of a consultant hired by Open University officials to analyze student churn using demographic, performance, and behavioral data. This R Markdown document is intended for review by the OU Analytics team only. If you would like to fork the code, make sure you have installed all the packages listed in the library section. The data source is available here: https://analyse.kmi.open.ac.uk/open_dataset.
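For anyone forking the code, a quick way to see which of the packages from the library section are not yet installed is sketched below. The variable names `required_pkgs` and `missing_pkgs` are illustrative and not part of the original code, and the install step is left commented out on the assumption that you want to confirm before installing from CRAN.

```r
# Packages loaded in the library section of this document
required_pkgs <- c("plotly", "data.table", "dplyr", "e1071", "ggplot2",
                   "caret", "ggthemes", "rpart", "randomForest")

# Names of required packages not found in the current R library paths
missing_pkgs <- setdiff(required_pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Assumption: installing from your default CRAN mirror is acceptable.
  # install.packages(missing_pkgs)
} else {
  message("All required packages are installed.")
}
```

Uncommenting the `install.packages()` line installs only the packages that are actually missing, so machines that already have everything are left untouched.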

#libraries
library(plotly)
library(data.table)
library(dplyr)
library(e1071)
library(ggplot2)
library(caret)
library(ggthemes)
library(rpart)
library(randomForest)

#setting path 
path <- '~/Documents/Masters/Data_science/adobe_challenge/'
source(paste(path, 'src/load.src.r', sep=''))
load.src(paste(path, 'src/', sep = ''))

#Setting seed for session for reproducible results
set.seed(145)

Presentation Flow:

  1. Introduction

Consumer and product analytics is a booming domain, as companies have started capturing customers' behaviour from their online activity. Many analytics teams have successfully identified metrics that characterize customer behaviour and used them for churn prediction, targeted advertising, marketing campaigns, and so on, leading to increased revenue. In our case, the product is the online learning platform and the customers are students. Based on the information provided, I have come up with recommendations that could help Open University decision makers and the analytics team achieve good results.

  2. Understanding the data and problem

The OULAD data is provided as 7 tables in comma-separated values (CSV) format. For more information on the tables and features, please check out: https://analyse.kmi.open.ac.uk/open_dataset#contacts. Based on the data provided, it is clear that we have 3 types of information available for our problem.